The psrccensus package allows R users at PSRC to download, summarize, and visualize Census, ACS, and PUMS data on common PSRC Census geographies.
Some of the Census, ACS, and PUMS data can be found in our Elmer database: see the internal wiki for more information.
Use the psrccensus R library rather than Elmer if you are working in R, you need a table that is not in Elmer, or you would like a formatted table or map to be produced.
The three most important functions are:
get_acs_recs(geography, state, counties, table.names, years, FIPS, acs.type)
get_decennial_recs(geography, counties, table_codes,year, fips)
get_pums_recs: to be coded later
There is also a function, create_tract_map, that lets you create a map of an ACS variable by tract.
To use the psrccensus library, you will first need to get a Census API key.
The first time you run this code, you will need to set your Census API key as an environment variable, if you haven’t done so before. You can sign up for a key at: https://api.census.gov/data/key_signup.html. After you run Sys.setenv on the key, you only need Sys.getenv to retrieve it for the rest of the R session. Sys.setenv does not persist across sessions, so to keep the key permanently, store it in your .Renviron file.
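As a minimal sketch of the check described above, the snippet below tests whether a key is already available before any download is attempted. The helper name has_census_key is hypothetical and is not part of psrccensus; it relies only on base R, where Sys.getenv returns an empty string for an unset variable.

```r
# Hypothetical helper (not part of psrccensus): TRUE if a Census API key is set.
has_census_key <- function() {
  # Sys.getenv() returns "" when the variable is unset, so nzchar() is FALSE.
  nzchar(Sys.getenv("CENSUS_API_KEY"))
}

if (!has_census_key()) {
  message("No CENSUS_API_KEY found; set one with Sys.setenv(CENSUS_API_KEY = '...')")
}
```

If the message appears, set the key as described above and run the check again.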
library(psrccensus)
#Sys.setenv(CENSUS_API_KEY = 'PUT YOUR KEY HERE')
Sys.getenv("CENSUS_API_KEY")
Next, you need to decide which tables you would like to download and summarize. This is the hardest part because you have to find the correct table code and decide on the geography and the years.
One helpful website for finding the tables on a topic from ACS and Census is: https://censusreporter.org/
Many of our frequently used Census/ACS/PUMS tables are nicely described in our wiki. Note that Census table names in Elmer are given a two character code such as H2. In the R library, to be consistent with tidycensus the codes are padded with zeroes such as H002.
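The padding rule can be illustrated with a small base R sketch. The function pad_table_code is hypothetical and not part of psrccensus; it assumes a code made of letters followed by digits and pads the digits to three places, which matches the H2 to H002 example above. Codes that already have three or more digits, such as B03002, are returned unchanged.

```r
# Hypothetical helper (not part of psrccensus): pad the numeric part of a
# table code to three digits, e.g. "H2" becomes "H002".
pad_table_code <- function(codes) {
  vapply(codes, function(code) {
    parts <- regmatches(code, regexec("^([A-Za-z]+)([0-9]+)$", code))[[1]]
    if (length(parts) != 3) stop("Unrecognized table code: ", code)
    digits <- parts[3]
    # Add leading zeros only when there are fewer than three digits.
    zeros <- strrep("0", max(0, 3 - nchar(digits)))
    paste0(toupper(parts[2]), zeros, digits)
  }, character(1), USE.NAMES = FALSE)
}

pad_table_code(c("H2", "P1"))
```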
You can also search the API variable lists from ACS and Census.
https://www.census.gov/data/developers/data-sets/decennial-census.html
Generally, ACS 1-year data are available for geographies with populations of 65,000 or more, so you can easily get 1-year data for counties or the region, for example. At the tract level, 5-year ACS data are more appropriate. Decennial Census data are available down to the block level.
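The 65,000 threshold can be written as a simple rule of thumb. The sketch below is illustrative only: choose_acs_type is a hypothetical helper, not a psrccensus function, and the value 'acs5' for the 5-year survey is an assumption based on tidycensus naming. The first population is the 2019 King County total used later in this page; the second is a made-up tract-sized value.

```r
# Hypothetical helper (not part of psrccensus): suggest an ACS survey type
# from the population of the geography being requested.
choose_acs_type <- function(population) {
  # 1-year estimates are published for populations of 65,000 or more;
  # 'acs5' is an assumed label for the 5-year survey.
  ifelse(population >= 65000, "acs1", "acs5")
}

choose_acs_type(c(2252782, 4000))
```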
The functions are documented on: https://psrc.github.io/psrccensus/reference/index.html
Suppose you wish to tabulate 2019 ACS 1-year estimates of total people by race and ethnicity by county, as provided in table B03002. You would use the following function call.
get_acs_recs(geography = 'county',
             table.names = c('B03002'),
             years = c(2019),
             acs.type = 'acs1')
## The 1-year ACS provides data for geographies with populations of 65,000 and greater.
## Getting data from the 2019 1-year ACS
## # A tibble: 105 x 11
## GEOID name state variable estimate moe label concept census_geography
## <chr> <chr> <chr> <chr> <dbl> <dbl> <chr> <chr> <chr>
## 1 53033 King County Washington B03002_001 2252782 NA Esti~ HISPAN~ County
## 2 53033 King County Washington B03002_002 2030140 NA Esti~ HISPAN~ County
## 3 53033 King County Washington B03002_003 1302544 3208 Esti~ HISPAN~ County
## 4 53033 King County Washington B03002_004 147822 4678 Esti~ HISPAN~ County
## 5 53033 King County Washington B03002_005 13321 1990 Esti~ HISPAN~ County
## 6 53033 King County Washington B03002_006 424590 7085 Esti~ HISPAN~ County
## 7 53033 King County Washington B03002_007 15702 1831 Esti~ HISPAN~ County
## 8 53033 King County Washington B03002_008 6574 3281 Esti~ HISPAN~ County
## 9 53033 King County Washington B03002_009 119587 8804 Esti~ HISPAN~ County
## 10 53033 King County Washington B03002_010 2639 1744 Esti~ HISPAN~ County
## # ... with 95 more rows, and 2 more variables: acs_type <chr>, year <dbl>
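Because the result is in long format, with one row per geography and variable, shares of a total can be computed by dividing each estimate by the row for the table total. The sketch below is a minimal illustration that rebuilds two King County rows from the output above by hand rather than calling the API; the data frame and the share column are introduced here for illustration only.

```r
# Two rows copied from the King County output above (hand-built, no API call).
acs <- data.frame(
  variable = c("B03002_001", "B03002_003"),
  estimate = c(2252782, 1302544),
  stringsAsFactors = FALSE
)

# B03002_001 is the table total; divide each estimate by it to get a share.
total <- acs$estimate[acs$variable == "B03002_001"]
acs$share <- acs$estimate / total
acs
```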
To generate Decennial Census tables for housing units and total population by MSA, you would call the following. Note: the table names are padded with 0s, so you call “H001” as opposed to “H1” as you would in Elmer. Only SF1 tables are currently implemented.
get_decennial_recs(geography = 'msa',
                   table_codes = c("H001", "P001"),
                   year = 2010,
                   fips = c('42660', "28420"))
## Getting data from the 2010 decennial Census
## Loading SF1 variables for 2010 from table H001. To cache this dataset for faster access to Census tables in the future, run this function with `cache_table = TRUE`. You only need to do this once per Census dataset.
## Using Census Summary File 1
## Getting data from the 2010 decennial Census
## Loading SF1 variables for 2010 from table P001. To cache this dataset for faster access to Census tables in the future, run this function with `cache_table = TRUE`. You only need to do this once per Census dataset.
## Using Census Summary File 1
## # A tibble: 4 x 6
## GEOID NAME variable value label concept
## <chr> <chr> <chr> <dbl> <chr> <chr>
## 1 28420 Kennewick-Pasco-Richland, WA Metro Area H001001 93041 Total HOUSING ~
## 2 42660 Seattle-Tacoma-Bellevue, WA Metro Area H001001 1463295 Total HOUSING ~
## 3 28420 Kennewick-Pasco-Richland, WA Metro Area P001001 253340 Total TOTAL PO~
## 4 42660 Seattle-Tacoma-Bellevue, WA Metro Area P001001 3439809 Total TOTAL PO~
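Since each table comes back as separate rows, combining them usually means reshaping from long to wide. As a minimal sketch, the code below copies the four values from the output above into a data frame by hand and derives persons per housing unit for each metro area with base R; the persons_per_unit column is an illustrative addition, not something psrccensus produces.

```r
# The four rows from the output above, entered by hand (no API call).
dec <- data.frame(
  GEOID = c("28420", "42660", "28420", "42660"),
  variable = c("H001001", "H001001", "P001001", "P001001"),
  value = c(93041, 1463295, 253340, 3439809),
  stringsAsFactors = FALSE
)

# Reshape to one row per GEOID with columns value.H001001 and value.P001001.
wide <- reshape(dec, idvar = "GEOID", timevar = "variable", direction = "wide")

# Total population divided by housing units.
wide$persons_per_unit <- wide$value.P001001 / wide$value.H001001
wide
```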
Let’s say you want to create a map of the tracts in the region for one variable. You can use the function create_tract_map. Here’s an example, mapping the non-Hispanic Black or African American alone population by tract:
library(sf)
library(dplyr)
# Download tract-level estimates for table B03002
tract.big.tbl <- psrccensus::get_acs_recs(geography = 'tract', table.names = c('B03002'), years = c(2019))
## Getting data from the 2015-2019 5-year ACS
# Keep only the non-Hispanic Black or African American alone estimate
tract.tbl <- tract.big.tbl %>%
  filter(label == 'Estimate!!Total:!!Not Hispanic or Latino:!!Black or African American alone')
# Read the tract layer from the ElmerGeo spatial database
gdb.nm <- paste0("MSSQL:server=", "AWS-PROD-SQL\\Sockeye",
                 ";database=", "ElmerGeo", ";trusted_connection=yes")
spn <- 2285
tract_layer_name <- "dbo.tract2010_nowater"
tract.lyr <- st_read(gdb.nm, tract_layer_name, crs = spn)
## Reading layer `dbo.tract2010_nowater' from data source
## `MSSQL:server=AWS-PROD-SQL\Sockeye;database=ElmerGeo;trusted_connection=yes'
## using driver `MSSQLSpatial'
## Simple feature collection with 773 features and 19 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 1099353 ymin: -97548.53 xmax: 1622631 ymax: 477101.5
## Projected CRS: NAD83 / Washington North (ftUS)
create_tract_map(tract.tbl, tract.lyr, map.title='Black, non-Hispanic Population',
map.title.position='topleft', legend.title='Black, Non-Hispanic Population',
legend.subtitle='by Census Tract')